dnn_speech_recognition | speech recognition

 by vysotin | Jupyter Notebook | Version: Current | License: No License

kandi X-RAY | dnn_speech_recognition Summary


dnn_speech_recognition is a Jupyter Notebook library. It has no reported bugs or vulnerabilities, and it has low support. You can download it from GitHub.

In this notebook, you will build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline. You begin by investigating the LibriSpeech dataset that will be used to train and evaluate your models. Your algorithm will first convert raw audio to feature representations that are commonly used for ASR. You will then move on to building neural networks that can map these audio features to transcribed text. After learning about the basic types of layers often used in deep learning approaches to ASR, you will carry out your own investigations by creating and testing your own state-of-the-art models. Throughout the notebook, recommended research papers are provided for additional reading, along with links to GitHub repositories with interesting implementations.
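As a rough sketch of the first pipeline stage (converting raw audio to a feature representation), the following computes a log-magnitude spectrogram with NumPy. The window length, hop size, and Hann windowing here are illustrative choices, not values taken from the notebook, which may use different parameters or MFCC features instead.

```python
import numpy as np

def log_spectrogram(signal, sample_rate=16000, window_ms=20, hop_ms=10, eps=1e-10):
    """Compute a log-magnitude spectrogram from a 1-D audio signal.

    Window/hop sizes are illustrative defaults; the notebook may use
    different parameters or a different feature type (e.g. MFCCs).
    """
    window = int(sample_rate * window_ms / 1000)   # samples per frame
    hop = int(sample_rate * hop_ms / 1000)         # samples between frame starts
    n_frames = 1 + (len(signal) - window) // hop
    hann = np.hanning(window)
    frames = np.stack([signal[i * hop:i * hop + window] * hann
                       for i in range(n_frames)])
    # Real FFT of each windowed frame, then log magnitude.
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(spectrum + eps)

# Example: one second of a 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
features = log_spectrogram(np.sin(2 * np.pi * 440 * t))
print(features.shape)  # (number of frames, number of frequency bins)
```

A 2-D array of this shape (time steps by frequency bins) is the kind of input the notebook's recurrent acoustic models consume.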

            kandi-support Support

              dnn_speech_recognition has a low active ecosystem.
              It has 0 stars and 0 forks. There is 1 watcher for this library.
              It has had no major release in the last 6 months.
              dnn_speech_recognition has no reported issues. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of dnn_speech_recognition is current.

            kandi-Quality Quality

              dnn_speech_recognition has no bugs reported.

            kandi-Security Security

              dnn_speech_recognition has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            kandi-License License

              dnn_speech_recognition does not have a standard license declared.
              Check the repository for any license declaration and review the terms closely.
              Without a license, all rights are reserved, and you cannot use the library in your applications.

            kandi-Reuse Reuse

              dnn_speech_recognition releases are not available. You will need to build from source code and install.
              Installation instructions, examples and code snippets are available.


            dnn_speech_recognition Key Features

            No Key Features are available at this moment for dnn_speech_recognition.

            dnn_speech_recognition Examples and Code Snippets

            No Code Snippets are available at this moment for dnn_speech_recognition.

            Community Discussions

            No Community Discussions are available at this moment for dnn_speech_recognition. Refer to the Stack Overflow page for discussions.

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install dnn_speech_recognition

            You should run this project with GPU acceleration for best performance.
            Clone the repository, and navigate to the downloaded folder.
            Create (and activate) a new environment with Python 3.5 and the numpy package. Linux or Mac: conda create --name aind-vui python=3.5 numpy source activate aind-vui Windows: conda create --name aind-vui python=3.5 numpy scipy activate aind-vui
            Install TensorFlow. Option 1: To install TensorFlow with GPU support, follow the guide to install the necessary NVIDIA software on your system. If you are using the Udacity AMI, you can skip this step and only need to install the tensorflow-gpu package: pip install tensorflow-gpu==1.1.0 Option 2: To install TensorFlow with CPU support only, pip install tensorflow==1.1.0
            Install a few pip packages.
            Switch the Keras backend to TensorFlow. Linux or Mac: KERAS_BACKEND=tensorflow python -c "from keras import backend" Windows: set KERAS_BACKEND=tensorflow python -c "from keras import backend" NOTE: a Keras/Windows bug may give this error after the first epoch of training model 0: 'rawunicodeescape' codec can't decode bytes in position 54-55: truncated \uXXXX. To fix it: Find the file keras/utils/generic_utils.py that you are using for the capstone project. It should be in your environment under Lib/site-packages. This may vary, but if using miniconda, for example, it might be located at C:/Users/username/Miniconda3/envs/aind-vui/Lib/site-packages/keras/utils. Copy generic_utils.py to OLDgeneric_utils.py just in case you need to restore it. Open the generic_utils.py file and change this code line: marshal.dumps(func.code).decode('raw_unicode_escape') to this code line: marshal.dumps(func.code).replace(b'\\', b'/').decode('raw_unicode_escape')
            Obtain the libav package. Linux: sudo apt-get install libav-tools Mac: brew install libav Windows: Browse to the Libav website Scroll down to "Windows Nightly and Release Builds" and click on the appropriate link for your system (32-bit or 64-bit). Click nightly-gpl. Download most recent archive file. Extract the file. Move the usr directory to your C: drive. Go back to your terminal window from above. rename C:\usr avconv set PATH=C:\avconv\bin;%PATH%
            Obtain the appropriate subsets of the LibriSpeech dataset, and convert all flac files to wav format. Linux or Mac: wget http://www.openslr.org/resources/12/dev-clean.tar.gz tar -xzvf dev-clean.tar.gz wget http://www.openslr.org/resources/12/test-clean.tar.gz tar -xzvf test-clean.tar.gz mv flac_to_wav.sh LibriSpeech cd LibriSpeech ./flac_to_wav.sh Windows: Download two files (file 1 and file 2) via browser and save in the AIND-VUI-Capstone directory. Extract them with an application that is compatible with tar and gz such as 7-zip or WinZip. Convert the files from your terminal window. move flac_to_wav.sh LibriSpeech cd LibriSpeech powershell ./flac_to_wav.sh
            Create JSON files corresponding to the train and validation datasets.
            Create an IPython kernel for the aind-vui environment. Open the notebook.
            Before running code, change the kernel to match the aind-vui environment by using the drop-down menu. Then, follow the instructions in the notebook.
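The "create JSON files" step above can be sketched as follows. Note that this is a stdlib-only illustration, not the project's actual script, and the field names ('key', 'duration', 'text') and the assumed LibriSpeech layout (leaf directories holding .wav files plus a *.trans.txt transcript index) are assumptions; check the repository for the script it actually provides.

```python
import json
import os
import wave

def build_manifest(librispeech_dir, out_path):
    """Write one JSON object per utterance: audio path, duration, transcript.

    Assumes the LibriSpeech layout after flac-to-wav conversion: each leaf
    directory holds .wav files plus a *.trans.txt file mapping utterance
    IDs to transcripts. Field names are illustrative, not the project's.
    """
    with open(out_path, 'w') as out:
        for root, _dirs, files in os.walk(librispeech_dir):
            # Collect transcripts for this directory, keyed by utterance ID.
            transcripts = {}
            for name in files:
                if name.endswith('.trans.txt'):
                    with open(os.path.join(root, name)) as f:
                        for line in f:
                            utt_id, _, text = line.strip().partition(' ')
                            transcripts[utt_id] = text
            # Emit one JSON line per wav file found alongside them.
            for name in sorted(files):
                if not name.endswith('.wav'):
                    continue
                path = os.path.join(root, name)
                with wave.open(path, 'rb') as w:
                    duration = w.getnframes() / float(w.getframerate())
                out.write(json.dumps({'key': path,
                                      'duration': duration,
                                      'text': transcripts.get(name[:-4], '')}) + '\n')
```

For example, build_manifest('LibriSpeech/dev-clean', 'train_corpus.json') would produce a newline-delimited JSON manifest of that subset.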

            Support

            For any new features, suggestions, or bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/vysotin/dnn_speech_recognition.git

          • CLI

            gh repo clone vysotin/dnn_speech_recognition

          • sshUrl

            git@github.com:vysotin/dnn_speech_recognition.git
